ABSTRACT

Model checking techniques applied to large industrial circuits suffer
from the state space explosion problem. A major technique
to address this problem is abstraction. The most commonly used
abstraction technique for hardware verification is localization reduction,
which removes latches that are not relevant to the property.
However, localization reduction fails to reduce the size of the
model if the property actually depends on most of the latches. This
paper proposes to verify RTL Verilog using predicate abstraction,
a technique successfully used for software verification. The
main challenge when using predicate abstraction is the discovery
of suitable predicates. We propose to use weakest pre-conditions
of Verilog statements in order to obtain new predicates during abstraction
refinement. This technique has not been applied to circuits
before. On benchmarks taken from an industrial microprocessor,
we successfully verified safety properties with more than 32,000
latches in the cone of influence. We compare the performance of
our technique with a modern model checker that implements localization
reduction.

Categories and Subject Descriptors: B.5.2 [Hardware]: Register-Transfer-Level
Implementation-Design Aids; J.6 [Computer Aided
Engineering]: [Computer-Aided Design]

General  Terms:  Verification 

Keywords:  Predicate  Abstraction,  Verilog,  SAT 

1.  INTRODUCTION 

Model  checking  [8]  is  one  of  the  most  commonly  used  formal 
verification  techniques  in  a  commercial  setting.  However,  model 
checking suffers from the state space explosion problem. One principal
method in state space reduction is abstraction. Abstraction
techniques reduce the state space by mapping the set of states of
the actual, concrete system to an abstract, and smaller, set of states
in a way that preserves the relevant behaviors of the system.

In  the  hardware  domain,  the  most  commonly  used  abstraction 
technique  is  localization  reduction  [16,  21,  5].  The  abstract  model 
is created from the given circuit by removing a large number of latches
together  with  the  logic  required  to  compute  their  next  state.  The 
latches  that  are  removed  are  called  the  invisible  latches.  The  latches 
remaining in the abstract model are called visible latches. The initial
abstract model is created by making the latches present in the
property visible and the rest invisible.

Localization  reduction  is  a  conservative  over-approximation  of 
the  original  circuit  for  reachability  properties.  This  implies  that 
if  the  abstraction  satisfies  the  property,  the  property  also  holds  on 
the  original,  concrete  circuit.  The  drawback  of  the  conservative 
abstraction  is  that  when  model  checking  of  the  abstraction  fails, 
it  may  produce  a  counterexample  that  does  not  correspond  to  any 
concrete  counterexample.  This  is  usually  called  a  spurious  coun¬ 
terexample. 

In  order  to  check  if  an  abstract  counterexample  is  spurious,  the 
abstract  counterexample  is  simulated  on  the  concrete  machine.  This 
is called the simulation step. As in Bounded Model Checking
(BMC), the concrete transition relation for the design and the given
property are jointly unwound to obtain a Boolean formula. The
number of unwinding steps is given by the length of the abstract
counterexample. As in BMC, the Boolean formula is then checked
for satisfiability using a SAT procedure [21]. If the instance is satisfiable,
the counterexample is real and the algorithm terminates. If
the instance is unsatisfiable, the abstract counterexample is spurious,
and abstraction refinement has to be performed.

The  basic  idea  of  the  abstraction  refinement  techniques  is  to  cre¬ 
ate  a  new  abstract  model  which  contains  more  detail  in  order  to 
prevent  the  spurious  counterexample.  This  process  is  iterated  until 
the property is either proved or disproved. It is known as the Counterexample
Guided Abstraction Refinement framework, or CEGAR
for short [16, 3, 6, 13, 21].

In the software domain, the most successful abstraction technique
for large systems is predicate abstraction [14]. It abstracts
data  by  only  keeping  track  of  certain  predicates  on  the  data.  Each 
predicate  is  represented  by  a  Boolean  variable  in  the  abstract  pro¬ 
gram,  while  the  original  data  variables  are  eliminated.  When  ap¬ 
plying  predicate  abstraction  to  circuits,  two  problems  arise: 

• Most model-checkers used in the hardware industry work on
a very low level design, usually a net-list. However, predicate
abstraction is only effective if the predicates can cover the relationship
between multiple latches. This typically requires a
word level model given in register transfer language (RTL),
e.g., in Verilog. The RTL level languages are similar to languages
used in the software domain, such as ANSI-C.
• The second problem concerns the use of theorem provers for
computing the predicate abstraction. Theorem provers model
the variables using unbounded integer numbers. Overflow or
bit-wise operators are not modeled. However, hardware description
languages like Verilog provide an extensive set of bit-wise
operators. For hardware designs, the use of these bit-level
constructs is ubiquitous.

Predicate  abstraction  tools  used  for  software  employ  multiple 
heuristics  in  order  to  reduce  the  cost  of  calling  the  theorem  prover 
while  computing  the  abstraction.  SLAM  [3]  applies  ad-hoc  heuris¬ 
tics  that  limit  the  number  of  predicates  in  a  query,  i.e.,  it  partitions 
the  set  of  predicates  into  smaller  subsets.  This  speeds  up  the  ab¬ 
straction  process,  but  the  resulting  abstraction  contains  additional 
spurious  behavior.  If  the  SLAM  toolkit  encounters  a  spurious  coun¬ 
terexample,  it  first  assumes  that  it  is  caused  by  a  lack  of  predicates, 
and  attempts  to  find  new  predicates.  If  no  new  predicates  are  found, 
SLAM  concludes  that  the  counterexample  is  caused  by  the  parti¬ 
tioning  of  the  predicates  during  the  abstraction.  In  this  case,  a  sep¬ 
arate  refinement  algorithm  (called  Constrain  [2])  is  invoked.  This 
step  only  addresses  spurious  behavior  due  to  an  inexact  abstraction, 
as  opposed  to  spurious  behavior  caused  by  insufficient  predicates. 

In  the  BLAST  tool  [15],  the  abstraction  is  completely  demand- 
driven.  Initially,  BLAST  uses  a  very  coarse  abstraction.  Additional 
abstraction  is  only  performed  when  a  spurious  counterexample  is 
encountered.  The  abstraction  is  only  done  to  the  extent  necessary 
to  remove  the  spurious  behavior.  This  is  called  lazy  abstraction. 

Contribution. This paper introduces new techniques for word-level
predicate abstraction and refinement for circuits given in Verilog
RTL. There are two challenges when applying predicate abstraction
to circuits: 1) the computation of the abstract model is
hard in the presence of a large number of predicates, and 2) predicate
abstraction relies on a good predicate discovery algorithm.

In  order  to  address  the  first  problem,  we  partition  the  set  of  pred¬ 
icates  into  clusters  of  related  predicates.  The  abstraction  is  com¬ 
puted  separately  with  respect  to  the  predicates  in  each  cluster.  Since 
each  cluster  contains  only  a  small  number  of  predicates,  the  com¬ 
putation  of  the  abstraction  becomes  more  efficient.  We  refer  to  this 
technique  as  predicate  partitioning.  As  a  result  of  the  partitioning, 
additional  spurious  counterexamples  are  introduced  which  have  to 
be  removed  during  the  refinement  phase.  In  this  context,  we  iden¬ 
tify  lazy  abstraction  [15]  and  eager  abstraction  [10]  as  special  cases 
of  predicate  partitioning.  The  eager  technique  refers  to  the  case 
when  all  predicates  are  within  a  single  cluster,  while  lazy  abstrac¬ 
tion  corresponds  to  the  case  in  which  none  or  very  few  predicates 
are  used  for  computing  the  abstract  model. 

As  in  [10],  we  use  SAT  to  compute  the  abstract  transition  rela¬ 
tion.  However,  the  predicate  partitioning  technique  is  also  applica¬ 
ble  with  any  other  solver  (or  theorem  prover). 

When  a  spurious  counterexample  is  encountered,  we  first  check 
whether  each  transition  in  the  counterexample  can  be  simulated  on 
the  original  program.  This  is  done  by  creating  a  SAT  instance  for 
the  simulation  of  each  abstract  transition.  If  the  SAT  instance  for 
an  abstract  transition  is  unsatisfiable,  then  the  abstract  transition 
is spurious. In this case, we refine the abstraction by adding constraints
on the abstract transition relation which eliminate the spurious
transition. We make use of the proof of unsatisfiability of
the SAT instance to extract a small set of predicates to eliminate
the transition. The fewer predicates are needed, the more spurious
counterexamples can be eliminated in one step.

When all instances are satisfiable, none of the abstract
transitions is spurious due to the partitioning. The immediate
conclusion then is that the spurious counterexample is caused by
insufficient predicates. In this case, we apply a novel word-level
refinement technique: we compute the weakest precondition of the
property with respect to the transition function given by the circuit.
To the best of our knowledge, this is the first time syntactic weakest
preconditions of circuits have been used for refinement in predicate
abstraction. The overall flow of the various techniques described
above is shown in Fig. 1.

Figure 1: Abstraction-refinement loop in this paper.

Related work. In [11], a SAT-based technique for predicate abstraction
of circuits given in Verilog is introduced. The circuit is
synthesized and transformed into net-list level. A SAT solver like
ZChaff [17] is used in order to perform the abstraction, which makes
it possible to support all bit-level constructs. However, if refinement
becomes necessary, only bit-level predicates are introduced.

Andraus  et  al.  [1]  present  a  scheme  for  automatic  abstraction  of 
behavioral  RTL  Verilog  to  the  CLU  language  used  by  the  UCLID 
system [4]. However, the abstractions produced by their approach
can be coarse as there is no direct support for bit-vectors and bitwise
operators in the CLU language. Also, no refinement is done
when a spurious counterexample is obtained.

Outline.  In  section  2,  we  provide  the  notation  used  throughout 
the  paper.  Section  3  describes  the  SAT-based  predicate  abstrac¬ 
tion.  Techniques  for  partitioning  predicates  are  given  in  section 
4.  We  present  techniques  for  word-level  abstraction  refinement  in 
section  5.  We  report  the  experimental  results  in  section  6,  and  con¬ 
clude  the  paper  in  section  7.  The  formal  semantics  of  the  subset  of 
Verilog  we  handle  can  be  found  in  our  technical  report  [9]. 

2.  PRELIMINARIES 

Let $\mathcal{R} = \{r_1, \ldots, r_n\}$ denote the set of registers. The state of
the Verilog program is given by the valuation of these registers.
We consider the external inputs to be registers without a next-state
function. Let $\mathcal{Q} \subseteq \mathcal{R}$ denote the set of registers that are not external
inputs, i.e., have a next-state function. We denote the next-state
function of a word-level register $r_i \in \mathcal{Q}$ by $f_i(r_1, \ldots, r_n)$, or $f_i(\bar{r})$
using vector notation. The transition relation $R(\bar{r}, \bar{r}')$ relates the
current state $\bar{r} \in S$ to the next state $\bar{r}'$ and is defined as follows:

$$R(\bar{r}, \bar{r}') := \bigwedge_{r_i \in \mathcal{Q}} \left(r_i' \Leftrightarrow f_i(\bar{r})\right)$$

Example: Consider a register $x$ of size 8 bits. In each clock cycle, if
$x$ is less than five, then the value of $x$ is incremented by two, else the
value of $x$ remains unchanged. Thus, the next state function of $x$ is
given by $((x < 5)\,?\,(x + 2) : x)$, where $?$ denotes the choice operator.
Note that we have a next state function for the whole register $x$ and
not for the individual bits in $x$.
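The word-level view of this example can be sketched in a few lines of Python. This is only an illustration: the function name `next_x` and the explicit 8-bit mask are assumptions made here, not part of the paper's tool flow.

```python
def next_x(x: int) -> int:
    """Next-state function of the 8-bit register x: (x < 5) ? (x + 2) : x."""
    assert 0 <= x < 256, "x is an 8-bit register"
    # The mask models 8-bit wrap-around; it never triggers here because x < 5.
    return (x + 2) & 0xFF if x < 5 else x

# Simulate a few clock cycles starting from x = 0.
trace = [0]
for _ in range(5):
    trace.append(next_x(trace[-1]))
```

The whole register is updated by one function, in contrast to a net-list model, which would carry eight separate bit-level next-state functions.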

We follow the counterexample guided abstraction refinement (CEGAR)
framework to prove the given property. The first step of the
CEGAR  loop  is  to  obtain  an  abstraction  of  the  given  program.  We 
use  SAT-based  predicate  abstraction  for  this  purpose. 


3.  PREDICATE  ABSTRACTION 

In predicate abstraction [14], the variables of the concrete program
are replaced by Boolean variables that correspond to a predicate
on the variables in the concrete program. These predicates are
functions that map a concrete state $\bar{r} \in S$ into a Boolean value. Let
$B = \{\pi_1, \ldots, \pi_k\}$ be the set of predicates over the given program.
When applying all predicates to a specific concrete state, one obtains
a vector of Boolean values, which represents an abstract state
$\bar{b}$. We denote this function by $\alpha(\bar{r})$. It maps a concrete state into an
abstract state and is therefore called an abstraction function.

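We perform an existential abstraction [7], i.e., the abstract model
can make a transition from an abstract state $\bar{b}$ to $\bar{b}'$ iff there is a
transition from $\bar{r}$ to $\bar{r}'$ in the concrete model and $\bar{r}$ is abstracted to
$\bar{b}$ and $\bar{r}'$ is abstracted to $\bar{b}'$. We call the abstract machine $\hat{T}$, and we
denote the transition relation of $\hat{T}$ by $\hat{R}$.

$$\hat{R} := \{(\bar{b}, \bar{b}') \mid \exists \bar{r}, \bar{r}' \in S : R(\bar{r}, \bar{r}') \wedge \alpha(\bar{r}) = \bar{b} \wedge \alpha(\bar{r}') = \bar{b}'\} \qquad (1)$$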

The  initial  state  I(r)  is  abstracted  as  follows: 

$$\hat{I}(\bar{b}) := \exists \bar{r} \in S : (\alpha(\bar{r}) = \bar{b}) \wedge I(\bar{r})$$

The abstraction of a safety property $P(\bar{r})$ is defined as follows: for
the property to hold on an abstract state $\bar{b}$, the property must hold
on all states $\bar{r}$ that are abstracted to $\bar{b}$.

$$\hat{P}(\bar{b}) := \forall \bar{r} \in S : (\alpha(\bar{r}) = \bar{b}) \Rightarrow P(\bar{r})$$

Thus, if $\hat{P}$ holds on all reachable states of the abstract model, $P$ also
holds on all reachable states of the concrete model.

SAT-based predicate abstraction. In [10], a SAT solver is
used to compute the abstraction of a sequential ANSI-C program.
This approach supports all ANSI-C integer operators, including the
bit-vector operators. We use a similar technique for computing the
abstraction of the Verilog programs. A symbolic variable $b_i$ is associated
with each predicate $\pi_i$. Each concrete state $\bar{r} = \{r_1, \ldots, r_n\}$
maps to an abstract state $\bar{b} = \{b_1, \ldots, b_k\}$, where $b_i = \pi_i(\bar{r})$. If
the concrete machine makes a transition from state $\bar{r}$ to state
$\bar{r}' = \{r_1', \ldots, r_n'\}$, then the abstract machine makes a transition from state
$\bar{b}$ to $\bar{b}' = \{b_1', \ldots, b_k'\}$, where $b_i' = \pi_i(\bar{r}')$.

The formula that is passed to the SAT solver directly follows
from the definition of the abstract transition relation $\hat{R}$ as given
in equation 1. The set of abstract transitions $\hat{R}$ is computed by
transforming equation 1 into conjunctive normal form (CNF) and
passing the resulting formula to a SAT solver. The satisfying assignments
obtained form the abstract transition relation $\hat{R}$.
Example: Let the transition relation $R(x, y, x', y')$ be $x' = y \wedge y' = x$.
Let the set of predicates be $\{x = 1, y = 1\}$. The equation for
computing $\hat{R}$ is given as follows:

$$\exists x, y, x', y' : (b_1 \Leftrightarrow (x = 1)) \wedge (b_2 \Leftrightarrow (y = 1)) \wedge R(x, y, x', y') \wedge (b_1' \Leftrightarrow (x' = 1)) \wedge (b_2' \Leftrightarrow (y' = 1))$$

The set of satisfying assignments to the above equation results in
$\hat{R} := ((b_1' \Leftrightarrow b_2) \wedge (b_2' \Leftrightarrow b_1))$.
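The result of this example can be cross-checked with a small Python sketch. Note the substitution: instead of a SAT solver, the sketch simply enumerates all concrete states, which is only feasible because the registers are assumed here to be 2 bits wide. The names `step`, `alpha`, and `closed_form` are illustrative, not taken from the paper.

```python
from itertools import product

WIDTH = 2                      # assumption: 2-bit registers keep the enumeration tiny
VALUES = range(1 << WIDTH)
BOOLS = [False, True]

def step(x, y):
    """Concrete transition relation x' = y, y' = x."""
    return y, x

def alpha(x, y):
    """Abstraction function for the predicates (x = 1, y = 1)."""
    return (x == 1, y == 1)

def abstract_transitions():
    """Existential abstraction: all pairs (b, b') witnessed by a concrete step."""
    return {(alpha(x, y), alpha(*step(x, y))) for x, y in product(VALUES, repeat=2)}

def closed_form(b, b_next):
    """The relation stated in the example: (b1' <=> b2) and (b2' <=> b1)."""
    return b_next[0] == b[1] and b_next[1] == b[0]

R_hat = abstract_transitions()
expected = {(b, bn)
            for b in product(BOOLS, repeat=2)
            for bn in product(BOOLS, repeat=2)
            if closed_form(b, bn)}
```

Under these assumptions the enumerated relation and the closed form coincide. A SAT-based implementation would obtain the same set by enumerating the satisfying assignments projected onto the Boolean variables.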

Note that the predicates used for abstraction can be arbitrary
Boolean expressions allowed by the Verilog syntax. Thus, the predicates
can involve operators for concatenation, extraction, etc. For
example, a[3:0] > 7 and ram[{addr, 1'b0}] == d[9:2] are allowed
as predicates.

4.  PREDICATE  PARTITIONING 

We  call  the  computation  of  the  exact  existential  abstraction  as 
described  in  the  previous  section  the  Eager  approach.  In  the  worst 
case,  the  number  of  satisfying  assignments  is  exponential  in  the 
number of predicates. In practice, computing abstractions using
the eager approach can be very slow even for a small number of
predicates.

The speed of the abstraction computation can be improved if
we do not aim at the most precise abstract transition relation.
That is, we allow our abstraction to be an over-approximation of
the abstract transition relation generated by the eager approach.
The SLAM toolkit, for example, limits the number of predicates
in each theorem prover query. Thus, the set of the predicates and
their next-state versions is partitioned into smaller sets of related
predicates. We call these sets clusters, and denote them by
$C_1, \ldots, C_l$, with $C_j \subseteq \{\pi_1, \ldots, \pi_k, \pi_1', \ldots, \pi_k'\}$, where $\pi_i'$ denotes the
next-state version of $\pi_i$.

The  equation  for  abstracting  the  transition  system  with  respect  to 
Cj  is  given  as  follows: 

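$$\exists \bar{r}, \bar{r}' : \bigwedge_{\pi_i \in C_j} b_i = \pi_i(\bar{r}) \;\wedge\; R(\bar{r}, \bar{r}') \;\wedge\; \bigwedge_{\pi_i' \in C_j} b_i' = \pi_i(\bar{r}')$$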

The satisfying assignments to the above equation correspond to the
abstract transition relation $\hat{R}_j$, which is represented symbolically
using BDDs. The number of satisfying assignments to the above
equation is limited by the size of cluster $C_j$, that is, $2^{|C_j|}$. Clearly, by
limiting the size of $C_j$, we can compute the abstract transition relations
much faster as compared to the eager approach.

The conjunction of the $l$ abstract transition relations $\hat{R}_1, \ldots, \hat{R}_l$ results
in the abstract transition relation $\hat{R}$:

$$\hat{R} := \bigwedge_{i=1}^{l} \hat{R}_i \qquad (2)$$

We refer to the above technique of partitioning the set of predicates
into various clusters, and using these clusters for computing the
abstraction $\hat{R}$, as predicate partitioning.

Claim. If $\hat{Q}$ denotes the transition relation obtained by using
the eager approach (Eqn. 1), and $\hat{R}$ denotes the transition relation
obtained by predicate partitioning (Eqn. 2), then $\hat{Q} \Rightarrow \hat{R}$.

The above claim is proved by observing that for all $1 \le j \le l$,
$\hat{Q} \Rightarrow \hat{R}_j$, because $\hat{R}_j$ only drops the constraints imposed by the
predicates outside $C_j$. Thus, $\hat{R}$ is an over-approximation of $\hat{Q}$, and hence, a
conservative over-approximation of the original circuit.
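The claim can be illustrated on the running example ($x' = y$, $y' = x$ with predicates $x = 1$ and $y = 1$). The following Python sketch is a brute-force stand-in for the SAT and BDD machinery: it assumes 2-bit registers, encodes an abstract transition as a tuple (b1, b2, b1', b2'), and treats a cluster as a tuple of positions in that encoding. All names are illustrative.

```python
from itertools import product

VALUES = range(4)  # assumption: 2-bit registers

def concrete_rows():
    """(b1, b2, b1', b2') for every concrete step of x' = y, y' = x."""
    rows = set()
    for x, y in product(VALUES, repeat=2):
        x_next, y_next = y, x
        rows.add((x == 1, y == 1, x_next == 1, y_next == 1))
    return rows

def project(row, cluster):
    """Keep only the positions that belong to the cluster."""
    return tuple(row[i] for i in cluster)

def partitioned(clusters):
    """Conjunction of the per-cluster abstract relations, as in Eqn. 2."""
    rows = concrete_rows()
    per_cluster = [{project(r, c) for r in rows} for c in clusters]
    return {t for t in product([False, True], repeat=4)
            if all(project(t, c) in rj for c, rj in zip(clusters, per_cluster))}

eager = partitioned([(0, 1, 2, 3)])      # a single cluster with all predicates
cone = partitioned([(1, 2), (0, 3)])     # {y = 1, x' = 1} and {x = 1, y' = 1}
coarse = partitioned([(0, 2), (1, 3)])   # unrelated predicates grouped together
```

In this small instance the cone-based clusters lose no precision, while the badly chosen clusters admit every abstract transition. Both results contain the eager relation, as the claim states.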

We evaluate two different techniques for creating predicate clusters
used in predicate partitioning: cone partitioning and partitioning
for lazy abstraction.

Syntactic  cone  partitioning.  This  technique  clusters  a  next 
state  predicate  with  a  set  of  current  state  predicates  if  the  variables 
appearing  in  the  current  state  predicates  affect  the  value  of  the  next 
state  predicate. 

Example: Let the transition relation $R(x, y, x', y')$ be $x' = y \wedge y' = x$.
Let the set of predicates be $\{x = 1, y = 1, x' = 1, y' = 1\}$. The value
of the predicate $y' = 1$ is affected by the value of $x$ (as $y'$ equals $x$).
Note that the value of $y' = 1$ is not affected by the value of $y$. Thus,
we keep $x = 1$ and $y' = 1$ together in a cluster $C_1$. Similarly, the
other cluster $C_2 := \{y = 1, x' = 1\}$ is obtained.

Syntactic  partitioning  for  lazy  abstraction.  The  idea  of 
lazy  abstraction  [15]  is  to  defer  the  abstraction  until  required  by 
a  spurious  counterexample.  A  completely  lazy  abstraction  corre¬ 
sponds  to  using  no  clusters.  Thus,  the  initial  abstraction  is  simply 
true.  Motivated  by  this  idea,  we  use  a  very  inexpensive  syntac¬ 
tic  partitioning  to  compute  a  very  coarse  initial  abstraction.  This  is 
done  to  compute  initial  abstractions  of  large  circuits  quickly. 

There are many ways to perform a partitioning for an inexpensive
abstraction. One simple technique is to create $k$ clusters, each
containing exactly one current-state predicate $\pi_i$. We follow a variant
of this technique: all current-state predicates that contain the
exact same set of variables are kept in the same partition. This is
useful if the given set of predicates contains many mutually exclusive
(or related) predicates such as $x = 1$, $x = 2$, $x = 3$. Keeping
these predicates in separate clusters will result in an exponential
number of contradicting abstract states, such as an abstract state in
which both $x = 1$ and $x = 2$ are true. The next-state predicates are
not used for computing the abstraction.

Example: Let the set of current-state predicates be $\{x < 200, x = 100, y = 100, z > 100\}$.
The clusters produced for lazy abstraction
are $C_1 := \{x < 200, x = 100\}$, $C_2 := \{y = 100\}$, $C_3 := \{z > 100\}$.
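The grouping rule used in this example (predicates over the exact same set of variables share a cluster) can be sketched as follows. As a simplifying assumption, each predicate is given as a pair of its text and a hand-written variable set; a real implementation would extract the variables from the Verilog expression. The function name `lazy_clusters` is hypothetical.

```python
def lazy_clusters(predicates):
    """Group current-state predicates by the exact set of variables they mention."""
    clusters = {}
    for text, variables in predicates:
        # frozenset makes the variable set usable as a dictionary key.
        clusters.setdefault(frozenset(variables), []).append(text)
    # Dictionaries preserve insertion order, so clusters appear in first-seen order.
    return list(clusters.values())

predicates = [
    ("x < 200", {"x"}),
    ("x = 100", {"x"}),
    ("y = 100", {"y"}),
    ("z > 100", {"z"}),
]
clusters = lazy_clusters(predicates)
```

For the predicates of the example, this yields the three clusters listed above.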

Once  the  abstraction  of  the  concrete  system  is  obtained,  we  model- 
check  it  using  the  NuSMV  model-checker  [18].  If  the  abstract 
model  satisfies  the  property,  the  property  also  holds  on  the  origi¬ 
nal,  concrete  circuit.  If  the  model  checking  of  the  abstraction  re¬ 
turns  false,  we  obtain  a  counterexample  from  the  model-checker.  In 
order  to  check  if  an  abstract  counterexample  corresponds  to  a  con¬ 
crete  counterexample,  a  simulation  step  is  performed.  If  the  coun¬ 
terexample  cannot  be  simulated  on  the  concrete  model,  it  is  called 
a  spurious  counterexample.  The  elimination  of  spurious  counterex¬ 
amples  from  the  abstract  model  is  described  in  the  next  section. 

5.  ABSTRACTION  REFINEMENT 

When  refining  the  abstract  model,  we  distinguish  between  two 
cases  of  spurious  behavior,  as  done  in  [11]:  Spurious  transitions 
are  abstract  transitions  which  do  not  have  any  corresponding  con¬ 
crete  transitions.  By  definition,  spurious  transitions  cannot  appear 
in  the  most  precise  abstraction  as  computed  by  the  eager  approach. 
However,  as  we  noted  earlier,  computing  the  most  precise  abstract 
model  is  expensive  and  thus,  we  make  use  of  the  various  parti¬ 
tioning  techniques.  These  techniques  can  typically  result  in  many 
spurious  transitions.  Spurious  prefixes  are  prefixes  of  the  abstract 
counterexample  that  do  not  have  a  corresponding  concrete  path. 
This  happens  when  the  set  of  predicates  is  not  rich  enough  to  cap¬ 
ture  the  relevant  behaviors  of  the  concrete  system,  even  for  the  most 
precise  abstraction. 

An abstract counterexample is a sequence of abstract states $s(1), \ldots, s(l)$,
where each abstract state $s(j)$ corresponds to a valuation
of the $k$ predicates $\pi_1, \ldots, \pi_k$. The value of $\pi_i$ in a state $s$ is denoted
by $s_i$. Recall that $\pi_i'$ denotes the next-state version of $\pi_i$. In order to
check if an abstract transition $s$ to $t$ can be simulated on the concrete
model, we create a SAT instance given by the following equation:

$$\bigwedge_{i=1}^{k} \pi_i = s_i \;\wedge\; R(\bar{r}, \bar{r}') \;\wedge\; \bigwedge_{i=1}^{k} \pi_i' = t_i$$

The  equation  above  is  transformed  into  CNF  and  passed  to  a  SAT 
solver.  If  the  SAT  solver  detects  the  equation  to  be  satisfiable,  the 
abstract  transition  can  be  simulated  on  the  concrete  model.  Other¬ 
wise,  the  abstract  transition  is  spurious. 

Removing spurious transitions. If the abstract transition is
spurious, the CNF instance is unsatisfiable. In this case, we make
use of the ZChaff SAT solver [17] for finding a subset of clauses in
the CNF instance which is also unsatisfiable (called an unsatisfiable
core). It is computed by making use of the proof of unsatisfiability
of the SAT instance [22]. We use the unsatisfiable core to determine
a subset of existing predicates ($\pi_i$) which are sufficient to show
that the abstract transition is spurious. The spurious transition is
removed from the abstract model by adding a constraint in terms of
the predicates appearing in the unsatisfiable core.

Example: Consider the abstract transition from $s = \{b_1 = 0, b_2 = 1\}$
to $t = \{b_1' = 0, b_2' = 0\}$, where $b_1$, $b_2$ represent the current state
values and $b_1'$, $b_2'$ represent the next state values of predicates $x > 2$,
$y = 3$, respectively. Let the next state functions be $x' = y$, $y' = x$.
Observe that in $s$, the predicate $y = 3$ is true. This implies that
$x' = 3$, and thus, $x' > 2$ must hold in $t$. However, $b_1'$ is false in $t$ and
thus, the transition from $s$ to $t$ is spurious. This transition can be
eliminated by adding the constraint $\neg(\neg b_1 \wedge b_2 \wedge \neg b_1' \wedge \neg b_2')$ to the
abstract model. However, this constraint removes just one spurious
transition. By making use of an unsatisfiable core, we can make the
constraint more general, thereby eliminating many spurious transitions
at the same time. In this example, the spurious
behavior is caused by $b_2 = 1$ and $b_1' = 0$. The unsatisfiable core allows
us to discover this fact. Now we can eliminate this abstract transition
and many more spurious transitions by adding the following
constraint to the abstract model: $\neg(b_2 \wedge \neg b_1')$.
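The effect of the generalized constraint can be checked by brute force. The sketch below does not compute an unsatisfiable core; it assumes the core has already pointed at $b_2$ and $b_1'$, and it replaces the SAT query by enumeration over registers that are assumed to be 3 bits wide. All function names are illustrative.

```python
from itertools import product

VALUES = range(8)  # assumption: 3-bit registers, enough to contain the constant 3
BOOLS = [False, True]

def alpha(x, y):
    """Abstraction function for the predicates (x > 2, y = 3)."""
    return (x > 2, y == 3)

def witnesses(s, t):
    """Concrete states abstracted to s whose successor (x' = y, y' = x) is abstracted to t."""
    return [(x, y) for x, y in product(VALUES, repeat=2)
            if alpha(x, y) == s and alpha(y, x) == t]

def general_constraint(s, t):
    """not (b2 and not b1'), built from the predicates named by the core."""
    return not (s[1] and not t[0])

s, t = (False, True), (False, False)
spurious = not witnesses(s, t)           # no concrete step matches s -> t

all_abstract = list(product(product(BOOLS, repeat=2), repeat=2))
removed = [(a, b) for a, b in all_abstract if not general_constraint(a, b)]
sound = all(not witnesses(a, b) for a, b in removed)
```

Under these assumptions the general constraint removes four abstract transitions, including the one from s to t, and none of them has a concrete witness. The specific four-literal constraint would remove only one.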

Removing spurious prefixes. In [11], the elimination of spurious
prefixes is done by adding a monolithic bit-level predicate. In
contrast  to  that,  we  make  use  of  weakest  preconditions  as  done  in 
software  verification.  We  generate  new  word-level  predicates  from 
the  weakest  pre-condition  of  the  given  property  with  respect  to  the 
transition  function  given  by  the  RTL  level  circuit. 

Weakest pre-conditions for Verilog. In software verification,
the weakest pre-condition $wp(st, \gamma)$ of a formula $\gamma$ is usually
defined with respect to a statement $st$ (e.g., an assignment). It is the
weakest formula whose truth before the execution of $st$ entails the
truth of $\gamma$ after $st$ terminates. In case of hardware, each state transition
can be viewed as a statement where the registers are assigned
values according to their next-state functions.

Recall that the set of registers that have a next-state function is
denoted by $\mathcal{Q}$. For example, external inputs do not appear in this
set. The next-state function for register $r_i \in \mathcal{Q}$ is given by $f_i(\bar{r})$.
We use $\bar{f}$ to denote the vector of the next state functions for the
registers in $\mathcal{Q}$. For any expression $e$, the expression $e[\bar{x}/\bar{y}]$ denotes
the simultaneous substitution of each $x_i$ in $e$ by $y_i$ from $\bar{y}$. Note that
$x_i$ and $y_i$ might themselves be expressions.

The weakest precondition of the property $\gamma(\bar{r})$ with respect to
one concrete transition is defined as follows:

$$wp_1(\bar{f}, \gamma(\bar{r})) := \gamma(\bar{r})[\bar{r}/\bar{f}]$$

The weakest precondition with respect to $i$ consecutive concrete
transitions is defined inductively as follows:

$$wp_i(\bar{f}, \gamma) := wp_1(\bar{f}, wp_{i-1}(\bar{f}, \gamma)) \qquad (i > 1)$$

In order to refine a spurious prefix of length $l > 0$, we compute
$wp_l(\bar{f}, P)$, where $P$ is the safety property we are interested in checking.
Intuitively, $P$ holds after $l$ transitions iff $wp_l(\bar{f}, P)$ holds
before $l$ transitions. Refinement corresponds to adding the Boolean
expressions occurring in $wp_l(\bar{f}, P)$ to the existing set of predicates.
The abstraction, created with respect to the new set of predicates,
results in a model that does not contain this spurious prefix.
Example: Let the property be $x < 3$, and the next state function
for the register $x$ be $((x < 5)\,?\,(x + 2) : x)$. Suppose we obtain a
spurious prefix of length equal to 1. The weakest pre-condition
$wp_1$ is given as $(((x < 5)\,?\,(x + 2) : x) < 3)$.
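The substitution-based definition can be mimicked with a small Python sketch that works on expression strings. This is a toy under explicit assumptions: the circuit has the single 8-bit register $x$ from the example, expressions are written in Python syntax rather than Verilog, and the helper names `wp1`, `wp`, and `holds` are invented for this illustration.

```python
import re

# Assumption: one 8-bit register x with next-state function (x < 5) ? (x + 2) : x.
NEXT_STATE = {"x": "((x + 2) % 256 if x < 5 else x)"}

def wp1(expr: str) -> str:
    """One-step weakest pre-condition: substitute each register by its next-state expression."""
    # A single re.sub pass replaces all identifiers simultaneously;
    # names that are not registers (such as 'if' and 'else') are left unchanged.
    return re.sub(r"[A-Za-z_]\w*",
                  lambda m: NEXT_STATE.get(m.group(0), m.group(0)),
                  expr)

def wp(i: int, expr: str) -> str:
    """wp_i, obtained by applying the one-step substitution i times."""
    for _ in range(i):
        expr = wp1(expr)
    return expr

def holds(expr: str, x: int) -> bool:
    """Evaluate an expression string for a concrete value of x."""
    return bool(eval(expr, {"__builtins__": {}}, {"x": x}))

wp_1 = wp(1, "x < 3")
states_1 = [x for x in range(256) if holds(wp_1, x)]
states_2 = [x for x in range(256) if holds(wp(2, "x < 3"), x)]
```

Here `wp_1` is the Python rendering of the formula in the example. Enumerating the 256 register values shows which states satisfy it, and how quickly the expression grows when the substitution is repeated.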

Simplifying  the  weakest  pre-conditions.  The  problem  with  the 
approach  above  is  that  the  predicates  generated  can  become  very 
complex  when  the  spurious  prefix  is  large.  This  will  adversely  af¬ 
fect  the  future  iterations  of  the  abstraction  refinement  loop.  In  soft¬ 
ware  verification,  this  problem  is  solved  by  computing  the  weakest 
pre-condition  with  respect  to  the  statements  appearing  in  the  spuri¬ 
ous  counterexample  trace  only.  This  is  not  directly  applicable  to  a 
synchronous  circuit. 

Instead, we apply a syntactic simplification to the weakest pre-conditions
at each step. The simplification uses data from the abstract
error trace. We exploit the fact that many of the control flow
guards in the Verilog file are also present in the current set of predicates.
The abstract trace assigns truth values to these predicates in
each abstract state. In order to simplify the weakest pre-conditions,
we substitute the guards in the weakest pre-conditions with their
truth values. Furthermore, we only add the atomic predicates in the
weakest pre-condition as the new predicates.

Example: Suppose the guard $x < 5$ is present in the current set of
predicates. Let the value of $x < 5$ in an abstract state $s$ be true.
The weakest pre-condition given as $(((x < 5)\,?\,(x + 2) : x) < 3)$
can be simplified in $s$ by substituting the value of $x < 5$. This
results in a new predicate $x + 2 < 3$ (or $x < 1$).
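The guard substitution in this example can be sketched on a tiny expression tree. The tuple encoding (`("ite", guard, then, else)` for the choice operator, guards as plain strings) and the function name `simplify` are assumptions made for this illustration. They do not reflect how the authors represent Verilog expressions.

```python
def simplify(expr, guard_values):
    """Replace each conditional whose guard has a known truth value by the taken branch."""
    if isinstance(expr, tuple):
        if expr[0] == "ite" and expr[1] in guard_values:
            taken = expr[2] if guard_values[expr[1]] else expr[3]
            return simplify(taken, guard_values)
        # Recurse into the operands; operator names are plain strings and pass through.
        return tuple(simplify(e, guard_values) for e in expr)
    return expr

# (((x < 5) ? (x + 2) : x) < 3) as a nested tuple.
wp_expr = ("<", ("ite", "x < 5", ("+", "x", 2), "x"), 3)

guard_true = simplify(wp_expr, {"x < 5": True})     # truth value taken from the abstract state
guard_false = simplify(wp_expr, {"x < 5": False})
guard_unknown = simplify(wp_expr, {})
```

With the guard set to true, only the atomic predicate $x + 2 < 3$ remains. If the guard is not among the current predicates, the expression is left unchanged.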

With  weakest  pre-condition  simplification,  it  is  not  always  enough 
to  compute  the  weakest  pre-condition  of  the  given  property  for  re¬ 
finement.  One  needs  to  identify  a  subset  of  existing  predicates, 
whose  weakest  pre-condition  must  be  computed  for  removing  the 
spurious  behavior.  This  is  done  by  simulating  the  spurious  prefix 
and  picking  the  predicates  that  appear  in  the  unsatisfiable  core.  If 
a  copy  of  predicate  p  in  cycle  k  appears  in  the  unsatisfiable  core, 
then  we  compute  the  weakest  pre-condition  of  p  for  k  steps. 

6.  EXPERIMENTAL  RESULTS 

The experiments were performed on a 1.5 GHz AMD machine
with 3 GB of memory running Linux. A time limit of one hour and
a memory limit of 700 MB were set for each run. We compare our
technique against a non-commercial version of the Cadence SMV
model checker [12]. Cadence SMV is a net-list based model checker;
it reads in an RTL circuit and generates a next-state function for
each latch in the circuit. In contrast, we generate a next-state
function for each register.

6.1  Benchmarks  and  Properties  Verified

Our benchmarks are taken from the Instruction Cache Unit (ICU)
and the Instruction Cache RAM (ICRAM) unit of the Sun PicoJava
II microprocessor [20]. The ICU fetches instructions from the
instruction cache and passes them to the decode unit. We checked
the property that, in case of a cache read miss, the ICU controller
implementation simulates the miss state transition diagram given in
the picoJava-II micro-architecture guide [20].

The ICRAM maintains a RAM of size 16KB (organized as 2048
entries of 64 bits each). If the write is enabled (icu_ram_we[1:0] =
2'b10), then the value of the data input (icu_din) is written to the
higher 32 bits of the location addressed by the input address (icu_addr).
This functionality of the ICRAM was encoded in the form of a safety
property using the current and the next state of the variables.
Observe that the property depends on the contents of the RAM. Thus,
even after applying techniques such as localization reduction,
the system will have 16KB (16 x 1024 x 8 = 131,072) latches. In order
to simplify the problem, we verified the property for RAM sizes of
512 bytes, 1KB, 2KB, and 4KB. These benchmarks are denoted as
M512B, M1KB, M2KB, and M4KB in Table 1, respectively.
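
To make the shape of this property concrete, the sketch below models one clock cycle of a simplified RAM in Python and checks the property over the current and next state. It is not the Verilog used in the experiments. The function names are ours, and the treatment of the lower 32 bits and of write-enable values other than 2'b10 (memory left unchanged) is an assumption, since the text above does not specify them.

```python
WORD_MASK = 0xFFFFFFFF  # 32-bit mask


def icram_step(mem, we, addr, din):
    """One clock cycle of a simplified, hypothetical ICRAM model.

    mem is a list of 64-bit entries; the next state is returned as a new list.
    """
    nxt = list(mem)
    if we == 0b10:
        # Write din to the higher 32 bits; the lower 32 bits are kept
        # (assumption: the text only describes the higher half).
        nxt[addr] = ((din & WORD_MASK) << 32) | (mem[addr] & WORD_MASK)
    return nxt


def property_holds(nxt, we, addr, din):
    """Safety property relating the inputs to the next state of the RAM."""
    if we != 0b10:
        return True  # the property only constrains enabled writes
    return (nxt[addr] >> 32) == (din & WORD_MASK)


def latch_count(ram_bytes):
    """Number of latches needed to store the RAM contents alone."""
    return ram_bytes * 8


# The full 16KB RAM: 2048 entries of 64 bits each.
print(latch_count(16 * 1024), 2048 * 64)  # 131072 131072

# The M512B variant holds 512 bytes, i.e. 64 entries of 64 bits.
mem = [0] * (512 * 8 // 64)
nxt = icram_step(mem, 0b10, 3, 0xDEADBEEF)
print(property_holds(nxt, 0b10, 3, 0xDEADBEEF))  # True
```

The RAM contents alone account for 4096 latches in the 512-byte variant; the remaining latches reported for M512B in Table 1 belong to the rest of the cone of influence.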

The benchmarks starting with "AR" perform arithmetic operations
on two registers a and b in each clock cycle. The next-state
functions of a and b are given as follows: a' := (a < 100) ? (a + b) : a
and b' := a. Initial values of these registers are 1 and 0, respectively.
We check the property that b < a in each clock cycle. The benchmarks
AR100, AR200, AR500, AR1000 in Table 1 are variants of
this circuit obtained by increasing the size of the registers a and b.

The experimental results are summarized in Table 1. The "Latches"
column contains the total number of latches in the cone of influence
of the property. We compare two different techniques for verifying
these benchmarks. The columns marked with "Predicate Abstraction"
contain the results of applying the predicate abstraction and
refinement techniques discussed in this paper. The "Time", "Abs",
"MC", and "Ref" columns contain the total time, followed by the
breakdown of the total time into the time taken by abstraction, model
checking, and refinement including simulation. The "P/I" column
contains the final number of predicates followed by the total number
of iterations.

The results of running Cadence SMV are given in the "CSMV"
column. Of the various options to Cadence SMV, we found the
counterexample-based abstraction refinement option -absref3
[6] to result in the best performance when checking the picoJava-II
properties. We report the total time taken by Cadence SMV when
running with this option.


                      Predicate Abstraction                  CSMV
Benchmark  Latches    Time     Abs     MC     Ref   P/I      Time
ICU             28     1.3     0.6    0.1     0.6   5/1       0.1
M512B         4137   107.1     2.2    0.8   104.1   3/8       2.3
M1KB          8234   180.8     9.3    0.8   170.7   3/8       7.5
M2KB         16427   450.7      24    0.9   425.3   3/8      25.0
M4KB         32796   843.3      37    0.8   805.5   3/8         -
AR100          202     3.5     2.8   0.12    0.55   3/3     182.4
AR200          402     9.6     8.4   0.12     1.1   3/3      2147
AR500         1002    32.2    29.3   0.12     2.8   3/3         *
AR1000        2002   122.6   116.8   0.16     5.6   3/3         *

Table 1: Experimental results: All runtimes are in seconds. A "*"
indicates a timeout of 1 hour. A "-" indicates the model checker
terminated due to the large number of BDD variables.


6.2  Summary  of  Results 

On the ICU benchmark, Cadence SMV outperforms predicate
abstraction. Since the state space of this benchmark is very small,
no abstraction is necessary. On the M512B, M1KB, and M2KB
benchmarks, the runtime of Cadence SMV is better than the predicate
abstraction runtime. However, Cadence SMV is not able to
handle the M4KB benchmark, which has a much larger state space.
Cadence SMV times out on the AR500 and the AR1000 benchmarks,
while the predicate abstraction method completes all four AR
benchmarks in under 123 seconds each. Some of the inferences
drawn from these observations are as follows:

•  The runtime of Cadence SMV grows exponentially with each
newly added latch. This trend is visible in the AR100 to AR1000
benchmarks. In these benchmarks, Cadence SMV is not able
to reduce the number of latches in the abstract model created,
making the model checking step expensive.

•  Using predicate abstraction, the size of the abstract model
remains constant even when the number of latches is increased.
This is because for many properties the number of word-level
predicates needed for the proof does not grow as the sizes of the
registers appearing in the property are increased. This trend is
visible in the M* and the AR* benchmarks, where the number
of predicates needed to prove the property does not change as
the number of latches is increased. Thus, the model checking
(MC) time is similar across all M* benchmarks and across the
AR* benchmarks.

•  The computation of the abstract model using predicate abstraction
requires the use of a decision procedure, which is a SAT
solver in our case. In general, the problem of computing the
precise existential abstraction (Eqn. 1) is itself exponential in
the number of predicates and the size of the transition relation
(number of latches). However, this complexity is not observed
in our experiments for two reasons: 1) the use of state-of-the-art
SAT solvers like ZChaff [17] and Siege [19], and 2) the use of
our predicate partitioning technique (Sec. 4) to handle the large
number of predicates. The experimental results indicate that the
abstraction computation time does not grow exponentially with
each newly added latch.

Figure 2: Runtime of the predicate abstraction and refinement with respect to number of latches: (a) M* benchmarks (b) AR* benchmarks
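
The source of the exponential cost mentioned in the last observation can be seen in a brute-force version of existential abstraction. The sketch below is not our SAT-based procedure: it enumerates all concrete states of a tiny, hypothetical circuit (a 4-bit register x with x' = (x < 5) ? (x + 2) : x) and records which pairs of predicate valuations are connected by some concrete transition. The circuit, the register width, and the three predicates are assumptions chosen for illustration.

```python
WIDTH = 4  # hypothetical register width in bits


def next_state(x):
    # Hypothetical next-state function: x' = (x < 5) ? (x + 2) : x
    return x + 2 if x < 5 else x


# Predicates over the concrete state (assumed for this sketch).
PREDICATES = [
    lambda x: x < 5,
    lambda x: x < 3,
    lambda x: x < 1,
]


def alpha(x):
    """Abstraction function: the tuple of predicate truth values at x."""
    return tuple(p(x) for p in PREDICATES)


def abstract_transitions():
    """Existential abstraction by explicit enumeration.

    An abstract transition (b, b') exists if some concrete state x
    satisfies alpha(x) == b and alpha(next_state(x)) == b'.
    """
    relation = set()
    for x in range(2 ** WIDTH):  # 2^WIDTH concrete states
        relation.add((alpha(x), alpha(next_state(x))))
    return relation


relation = abstract_transitions()
candidates = (2 ** len(PREDICATES)) ** 2
print(len(relation), "of", candidates, "candidate abstract transitions")
# 4 of 64 candidate abstract transitions
```

Both the loop over 2^WIDTH concrete states and the 2^(2n) candidate abstract transitions for n predicates grow exponentially. A SAT solver avoids the explicit enumeration of concrete states, and partitioning the predicates into small clusters bounds the number of candidate valuations that have to be considered together.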

A plot of the total time needed by the predicate abstraction
technique against the number of latches is given in Fig. 2(a) and
Fig. 2(b) for the M* and the AR* benchmarks, respectively.
Observe that the runtime does not increase exponentially with the
number of latches. These experiments support the hypothesis that
iterative predicate abstraction and refinement can scale to circuits
involving thousands of latches.

Predicate partitioning techniques. We experimented with two
different techniques for creating predicate clusters (Sec. 4), namely
cone partitioning and partitioning for lazy abstraction. The two
techniques are complementary. Cone partitioning attempts to keep
all related predicates together; thus, the abstract model produced is
more precise than the one produced by lazy abstraction.
However, the time taken for abstraction using cone partitioning can
become a bottleneck. In such cases, lazy abstraction works well if
the property can be checked using a coarse abstract model. Cone
partitioning is used for the AR* benchmarks, while lazy abstraction
is used for the M* benchmarks. Observe that the total time is
dominated by the abstraction time in case of the AR* benchmarks, and
by the refinement time in case of the M* benchmarks. Additional
experiments can be found in our technical report [9].

Performance on Vapor benchmarks. The Vapor tool [1] performs
abstraction of Verilog models to the CLU language [4] for
verification. In [1], Vapor was used to verify control-related properties
of the ITC99 circuits. We found that all 21 properties of the
ITC-b13 benchmark are proved trivially using predicate abstraction.
Of the 21 properties, 18 can be proved using two or fewer
predicates. The time taken is less than one second and no
refinement iterations are needed. The remaining three properties
are proved in less than 20 seconds, using 8 or fewer predicates,
with four refinement iterations. The other ITC99 circuits reported
in [1] are also handled in a straightforward way.

7.  CONCLUSIONS 

Localization  reduction  fails  if  the  property  depends  on  too  many 
latches.  This  requires  a  stronger  abstraction  technique.  This  pa¬ 
per  presents  novel  algorithms  for  computing  and  refining  predicate 
abstractions  of  circuits  given  in  RTL  Verilog  using  SAT. 

There are two challenges when using predicate abstraction on
Verilog: 1) the computation of the abstract model, and 2) how to
obtain good predicates. We address the first challenge by introducing
predicate partitioning, a hybrid between lazy abstraction [15]
and eager abstraction. We make use of unsatisfiable cores of SAT
instances in order to eliminate multiple spurious transitions caused
by the imprecise abstraction.


In order to obtain the right set of predicates, we compute new
word-level predicates by using weakest pre-conditions of Verilog
RTL. Weakest pre-conditions are commonly used in the software
domain. However, this technique has not been applied to hardware
before, despite the fact that high-level RTL closely resembles
languages like ANSI-C. Our experimental results show that this
technique is very effective in discovering new word-level predicates for
refinement. On the large benchmarks, our new algorithm scales
well with the design size and clearly outperforms existing
algorithms that use localization reduction.